LunarLander environment


Loss and Reward Weighing for increased learning in Distributed Reinforcement Learning

Holen, Martin, Andersen, Per-Arne, Knausgård, Kristian Muri, Goodwin, Morten

arXiv.org Artificial Intelligence

This paper introduces two learning schemes for distributed agents in Reinforcement Learning (RL) environments, namely Reward-Weighted (R-Weighted) and Loss-Weighted (L-Weighted) gradient mergers. The R/L-weighted methods replace standard practices for training multiple agents, such as summing or averaging the gradients. The core of our methods is to scale the gradient of each actor based on how high its reward (for R-Weighted) or loss (for L-Weighted) is compared to the other actors. During training, each agent operates in a differently initialized version of the same environment, which yields different gradients from different actors. In essence, the R-Weights and L-Weights of each agent inform the other agents of its potential, which in turn indicates which environment should be prioritized for learning. This approach to distributed learning is possible because environments that yield higher rewards or lower losses carry more critical information than environments that yield lower rewards or higher losses. We empirically demonstrate that the R-Weighted methods outperform the state of the art in multiple RL environments.
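The core idea of the R-Weighted merger, scaling each actor's gradient by its relative reward before merging, can be sketched as follows. This is a minimal illustration assuming non-negative rewards and a simple proportional weighting; the paper's exact scaling rule may differ.

```python
import numpy as np

def reward_weighted_merge(gradients, rewards):
    """Merge per-actor gradients, scaling each by its actor's share of
    the total reward (a simplified sketch of the R-Weighted idea;
    assumes non-negative rewards)."""
    rewards = np.asarray(rewards, dtype=float)
    weights = rewards / rewards.sum()
    # A weighted sum replaces the usual plain average of the gradients.
    return sum(w * g for w, g in zip(weights, gradients))

# Two actors: the first earned three times the reward of the second,
# so its gradient dominates the merged update.
grads = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
merged = reward_weighted_merge(grads, rewards=[3.0, 1.0])
```

An L-Weighted variant would weight by (inverse) loss instead of reward, but the merging step is otherwise the same.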


Reinforcement Learning (Part 1): Deep Q-Learning using TensorFlow 2

#artificialintelligence

In this tutorial, we will be discussing what Q-learning is and how to implement it using TensorFlow 2. Q-learning is one of the most popular reinforcement learning (RL) algorithms; it is an off-policy method that finds the best action for a given state. Q-learning is considered off-policy because it does not depend on the current policy: it can learn from actions outside that policy, such as random actions, so the policy itself is not needed for the update. Here, Q stands for quality, meaning how useful a given action is in the current state for obtaining future reward.
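The off-policy update described above can be written in a few lines. This is a tabular sketch of the standard Q-learning rule (the tutorial itself uses a neural network with TensorFlow 2, but the target is computed the same way); the state/action sizes and learning rates here are illustrative.

```python
import numpy as np

n_states, n_actions = 4, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.5, 0.9  # learning rate and discount factor

def q_update(Q, s, a, r, s_next):
    # Off-policy target: bootstrap from the greedy (max) next action,
    # regardless of which action the behavior policy actually took.
    target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

# One transition: state 0, action 1, reward 1.0, next state 1.
Q = q_update(Q, s=0, a=1, r=1.0, s_next=1)
```

The max over next-state actions is what makes the update independent of the behavior policy, which is exactly the off-policy property the article describes.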


Policy Gradient (REINFORCE) using TensorFlow 2

#artificialintelligence

In this article, we will be discussing what policy gradients are and how to implement them using TensorFlow 2. There are three main points in the policy gradient algorithm; by keeping these principles in mind, we can implement the method in TensorFlow. We divide our source code into two parts. The policy network takes the current state as input and outputs probabilities for all actions.
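The gradient REINFORCE follows can be computed by hand for a small linear-softmax policy. This is an illustrative NumPy sketch (a TensorFlow 2 version would obtain the same gradient automatically under a GradientTape); the policy parameterization and return value here are assumptions for the example.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def reinforce_grad(theta, state, action, G):
    """Gradient of G * log pi(action|state) for a policy whose
    per-action logits are theta @ state (ascent direction)."""
    logits = theta @ state
    probs = softmax(logits)
    # Gradient of log-softmax w.r.t. the logits: one_hot(action) - probs.
    dlogits = -probs
    dlogits[action] += 1.0
    # Chain rule back to theta, scaled by the episode return G.
    return G * np.outer(dlogits, state)

theta = np.zeros((2, 3))            # 2 actions, 3 state features
state = np.array([1.0, 0.0, 0.0])
g = reinforce_grad(theta, state, action=0, G=2.0)
```

Stepping theta in the direction of g raises the probability of actions that led to high returns, which is the core of the algorithm.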


Dueling Double Deep Q-Learning with TensorFlow

#artificialintelligence

In this article, we will be going through what Dueling Double Deep Q-Learning is and how to implement it in TensorFlow. Dueling Double Deep Q-Learning is the combination of Dueling Deep Q-Learning and Double Deep Q-Learning, so let's try to understand each in turn. One of the drawbacks of the DQN algorithm is that it overestimates the true rewards: its Q-values predict a higher return than the agent will actually obtain. This overestimation comes from the max over next-state Q-values in the Q-learning update equation.
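The two ingredients can each be sketched in a few lines. This illustrative NumPy snippet (not the article's actual TensorFlow code) shows the standard dueling aggregation of value and advantage streams, and the Double-Q target where the online network selects the next action and the target network evaluates it:

```python
import numpy as np

def dueling_q(value, advantages):
    # Dueling aggregator: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a).
    # Subtracting the mean keeps V and A identifiable.
    advantages = np.asarray(advantages, dtype=float)
    return value + advantages - advantages.mean()

def double_q_target(reward, gamma, q_online_next, q_target_next):
    # Online network selects the next action; target network evaluates it.
    # Decoupling selection from evaluation reduces the max-operator
    # overestimation described above.
    a_star = int(np.argmax(q_online_next))
    return reward + gamma * q_target_next[a_star]

q = dueling_q(value=1.0, advantages=[2.0, 0.0])
y = double_q_target(1.0, 0.9,
                    q_online_next=[0.2, 0.8],
                    q_target_next=[0.5, 0.3])
```

In a full implementation both functions operate on network outputs, but the arithmetic is exactly this.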


Uncertainty-Based Out-of-Distribution Classification in Deep Reinforcement Learning

Sedlmeier, Andreas, Gabor, Thomas, Phan, Thomy, Belzner, Lenz, Linnhoff-Popien, Claudia

arXiv.org Machine Learning

Robustness to out-of-distribution (OOD) data is an important goal in building reliable machine learning systems. Especially in autonomous systems, wrong predictions for OOD inputs can cause safety-critical situations. As a first step towards a solution, we consider the problem of detecting such data in a value-based deep reinforcement learning (RL) setting. Modelling this problem as a one-class classification problem, we propose a framework for uncertainty-based OOD classification: UBOOD. It is based on the effect that an agent's epistemic uncertainty is reduced for situations encountered during training (in-distribution), and thus lower than for unencountered (OOD) situations. Being agnostic towards the approach used for estimating epistemic uncertainty, combinations with different uncertainty estimation methods, e.g., approximate Bayesian inference or ensembling techniques, are possible. We further present a first viable solution for calculating a dynamic classification threshold, based on the uncertainty distribution of the training data. Evaluation shows that the framework produces reliable classification results when combined with ensemble-based estimators, while the combination with concrete dropout-based estimators fails to reliably detect OOD situations. In summary, UBOOD presents a viable approach for OOD classification in deep RL settings by leveraging the epistemic uncertainty of the agent's value function.
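The ensemble-based variant of the scheme can be sketched as follows. This is an illustrative toy, not the paper's implementation: epistemic uncertainty is approximated by the variance of an ensemble's value estimates, and the dynamic threshold is taken as a percentile of the training-data uncertainties (the paper's exact threshold rule may differ).

```python
import numpy as np

def epistemic_uncertainty(ensemble_q):
    # Disagreement among ensemble members' value estimates serves as
    # a proxy for the agent's epistemic uncertainty.
    return float(np.var(ensemble_q))

def fit_threshold(train_uncertainties, percentile=95):
    # Dynamic threshold from the in-distribution (training) uncertainty
    # distribution: anything above it is classified as OOD.
    return float(np.percentile(train_uncertainties, percentile))

# Ensemble members agree closely on states seen during training...
train_u = [epistemic_uncertainty(q) for q in
           ([1.0, 1.1, 0.9], [0.5, 0.6, 0.4], [2.0, 2.1, 1.9])]
tau = fit_threshold(train_u)

# ...and disagree wildly on an unencountered state, which gets flagged.
is_ood = epistemic_uncertainty([0.1, 3.0, -2.0]) > tau
```

This one-class framing needs no OOD examples at training time, which is the property the abstract emphasizes.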